Sample size and informetric model goodness-of-fit outcomes: a search engine log case study
Abstract
The influence of sample size on informetric characteristics is examined to determine whether theoretical mathematical models can adequately fit large data sets. Two large data sets of queries submitted to the Excite search service were sampled for search characteristics (term frequencies, terms used per query, pages viewed per query, queries submitted per session), producing data sets of various sizes that were fitted to theoretical models to determine how the sample size may influence a model's goodness-of-fit. Although theoretical models could adequately fit smaller data sets of up to 5000 observations in some cases, larger data sets could not be satisfactorily fitted using several goodness-of-fit techniques. Investigators must take into account that sample size does influence goodness-of-fit outcomes. It is the nature of the data, and not the limitations of the given goodness-of-fit tests, that results in significant outcomes. Such goodness-of-fit tests should be used for comparative purposes rather than for significance testing.
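The sample-size effect described in the abstract can be illustrated with a minimal sketch. It assumes a Pearson chi-square test, which is only one possible goodness-of-fit technique and is not confirmed here as the one the study used. The category proportions are hypothetical and are not taken from the Excite data. The function names `chi_square_statistic` and `rejects_model` are likewise illustrative. The gap between the observed and model proportions is held fixed while only the number of observations changes.

```python
# Hypothetical proportions for four categories; NOT taken from the Excite data.
MODEL_PROPS = [0.50, 0.25, 0.15, 0.10]       # what a theoretical model predicts
OBSERVED_PROPS = [0.515, 0.245, 0.14, 0.10]  # what a sample shows (slightly off)

# Chi-square critical value for alpha = 0.05 with 3 degrees of freedom
# (4 categories minus 1, assuming no model parameters were estimated).
CRITICAL_05_DF3 = 7.815


def chi_square_statistic(n, observed_props, model_props):
    """Pearson chi-square statistic when n observations follow observed_props."""
    total = 0.0
    for obs_p, exp_p in zip(observed_props, model_props):
        observed_count = n * obs_p
        expected_count = n * exp_p
        total += (observed_count - expected_count) ** 2 / expected_count
    return total


def rejects_model(n):
    """True if the model is rejected at the 5% level for a sample of size n."""
    return chi_square_statistic(n, OBSERVED_PROPS, MODEL_PROPS) > CRITICAL_05_DF3


if __name__ == "__main__":
    for n in (500, 5000, 50000):
        stat = chi_square_statistic(n, OBSERVED_PROPS, MODEL_PROPS)
        verdict = "reject" if rejects_model(n) else "do not reject"
        print(f"n={n:>6}  chi-square={stat:8.3f}  {verdict}")
```

Because the statistic grows in proportion to n while the critical value stays fixed, the same small discrepancy passes at 500 and 5000 observations but is rejected at 50000. Under these assumptions, comparing statistics across candidate models at a fixed sample size is more informative than a significance verdict.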
Journal: J. Information Science
Volume: 32
Publication date: 2006